imitation policy
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > Puerto Rico > San Juan > San Juan (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Invariant Causal Imitation Learning for Generalizable Policies
Bica, Ioana, Jarrett, Daniel, van der Schaar, Mihaela
Consider learning an imitation policy on the basis of demonstrated behavior from multiple environments, with an eye towards deployment in an unseen environment. Since the observable features from each setting may differ, directly learning individual policies as mappings from features to actions is prone to spurious correlations and may not generalize well. However, the expert's policy is often a function of a shared latent structure underlying those observable features that is invariant across settings. By leveraging data from multiple environments, we propose Invariant Causal Imitation Learning (ICIL), a novel technique that learns a feature representation invariant across domains and, on that basis, an imitation policy that matches expert behavior. To cope with transition dynamics mismatch, ICIL learns a shared representation of causal features (for all training environments) that is disentangled from the specific representations of noise variables (for each of those environments). Moreover, to ensure that the learned policy matches the observation distribution of the expert's policy, ICIL estimates the energy of the expert's observations and uses a regularization term that minimizes the imitator policy's next-state energy. Experimentally, we compare our method against several benchmarks on control and healthcare tasks and show its effectiveness in learning imitation policies capable of generalizing to unseen environments.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- (2 more...)
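As a rough illustration of the idea in this abstract, the sketch below pairs a shared encoder for causal features with per-environment noise encoders, feeds only the shared features to the policy, and penalizes cross-covariance between the two representations as a crude disentanglement term. This is a minimal sketch assuming PyTorch; the class and loss names are invented here, and the paper's actual architecture and objectives differ in detail.

```python
# Minimal sketch of the ICIL structure: a shared "causal" encoder whose
# features drive the policy, plus one "noise" encoder per training
# environment. Names and losses are illustrative, not the paper's.
import torch
import torch.nn as nn

class ICILSketch(nn.Module):
    def __init__(self, obs_dim, act_dim, latent_dim, n_envs):
        super().__init__()
        # Shared representation of causal features (all training environments).
        self.causal_enc = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        # Environment-specific representations of noise variables.
        self.noise_encs = nn.ModuleList([
            nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                          nn.Linear(64, latent_dim))
            for _ in range(n_envs)])
        # The imitation policy acts only on the invariant features.
        self.policy = nn.Linear(latent_dim, act_dim)

    def forward(self, obs, env_id):
        z_causal = self.causal_enc(obs)
        z_noise = self.noise_encs[env_id](obs)
        return self.policy(z_causal), z_causal, z_noise

def disentangle_loss(z_causal, z_noise):
    # Crude stand-in for disentanglement: penalize the squared
    # cross-covariance between causal and noise features of a batch.
    zc = z_causal - z_causal.mean(dim=0)
    zn = z_noise - z_noise.mean(dim=0)
    return (zc.T @ zn / zc.shape[0]).pow(2).mean()
```

A behavior cloning loss on the policy output plus this penalty (and, per the abstract, an energy-based regularizer on next states) would then be minimized jointly.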
Learning Wheelchair Tennis Navigation from Broadcast Videos with Domain Knowledge Transfer and Diffusion Motion Planning
Wu, Zixuan, Zaidi, Zulfiqar, Patil, Adithya, Xiao, Qingyu, Gombolay, Matthew
In this paper, we propose a novel and generalizable zero-shot knowledge transfer framework that distills expert sports navigation strategies from web videos into robotic systems with adversarial constraints and out-of-distribution image trajectories. Our pipeline enables diffusion-based imitation learning by reconstructing the full 3D task space from multiple partial views, warping it into 2D image space, closing the planning loop within this 2D space, and transferring the constrained motion of interest back to task space. Additionally, we demonstrate that the learned policy can serve as a local planner in conjunction with position control. We apply this framework to the wheelchair tennis navigation problem, guiding the wheelchair into the ball-hitting region. Our pipeline achieves a navigation success rate of 97.67% in reaching real-world recorded tennis ball trajectories with a physical robot wheelchair, and a success rate of 68.49% in a real-world, real-time experiment on a full-sized tennis court.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > South Korea (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
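One concrete step in the pipeline above is the warp between the reconstructed 3D task space and the 2D image space where the planning loop closes. The snippet below sketches that round trip with a plain pinhole camera model; the intrinsics matrix, depth assumption, and function names are placeholder assumptions, not the paper's calibration or warping procedure.

```python
# Illustrative pinhole-camera round trip: 3D task-space points into a 2D
# image plane (where a diffusion planner could operate), and a planned 2D
# waypoint lifted back to 3D at a known depth. All values are placeholders.
import numpy as np

K = np.array([[800.0, 0.0, 320.0],   # placeholder camera intrinsics
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def project_to_image(points_3d, K):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates."""
    p = (K @ points_3d.T).T
    return p[:, :2] / p[:, 2:3]

def lift_from_image(pixel, K, depth):
    """Back-project one planned 2D waypoint to 3D, assuming its depth is
    known (e.g. the wheelchair stays on the court's ground plane)."""
    uv1 = np.array([pixel[0], pixel[1], 1.0])
    return np.linalg.inv(K) @ uv1 * depth
```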
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Strategy and Skill Learning for Physics-based Table Tennis Animation
Wang, Jiashun, Hodgins, Jessica, Won, Jungdam
Recent advancements in physics-based character animation leverage deep learning to generate agile and natural motion, enabling characters to execute movements such as backflips, boxing, and tennis. However, reproducing the way humans select and use diverse motor skills in dynamic environments to solve complex tasks remains a challenge. We present a strategy and skill learning approach for physics-based table tennis animation. Our method addresses the issue of mode collapse, where characters fail to fully utilize the motor skills they need to execute complex tasks. More specifically, we demonstrate a hierarchical control system for diversified skill learning and a strategy learning framework for effective decision-making. We showcase the efficacy of our method through comparative analysis with state-of-the-art methods, demonstrating its capabilities in executing various skills for table tennis. Our strategy learning framework is validated through both agent-agent and human-agent interaction in Virtual Reality, handling both competitive and cooperative tasks.
- North America > United States > Colorado > Denver County > Denver (0.05)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (2 more...)
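The hierarchical control system mentioned in the abstract can be pictured as a high-level strategy policy selecting among low-level skill policies. Below is a minimal sketch of that structure in PyTorch, under stated assumptions: class names, network sizes, and the stochastic skill selection (one simple guard against mode collapse) are invented for illustration and are not the paper's controllers.

```python
# Hedged sketch of hierarchical control: a strategy head picks a skill,
# the chosen skill policy emits the motor action. Unbatched for clarity.
import torch
import torch.nn as nn

class SkillPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))

    def forward(self, obs):
        return self.net(obs)

class HierarchicalController(nn.Module):
    def __init__(self, obs_dim, act_dim, n_skills):
        super().__init__()
        # Diversified low-level skills (e.g. forehand, backhand, footwork).
        self.skills = nn.ModuleList(
            [SkillPolicy(obs_dim, act_dim) for _ in range(n_skills)])
        # High-level strategy: a categorical distribution over skills.
        self.strategy = nn.Linear(obs_dim, n_skills)

    def forward(self, obs):
        # Sampling (rather than always taking the argmax) keeps every skill
        # in play, a simple hedge against collapsing onto one mode.
        dist = torch.distributions.Categorical(logits=self.strategy(obs))
        skill_id = int(dist.sample())
        return self.skills[skill_id](obs), skill_id
```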
SAFE-GIL: SAFEty Guided Imitation Learning
Ciftci, Yusuf Umut, Feng, Zeyuan, Bansal, Somil
Behavior cloning is a popular approach to imitation learning, in which a robot observes an expert supervisor and learns a control policy. However, behavior cloning suffers from the compounding error problem: policy errors compound as the policy deviates from the expert demonstrations and can lead to catastrophic system failures, limiting its use in safety-critical applications. On-policy data aggregation methods address this issue at the cost of repeatedly rolling out and retraining the imitation policy, which can be tedious and computationally prohibitive. We propose SAFE-GIL, an off-policy behavior cloning method that guides the expert via adversarial disturbance during data collection. The algorithm abstracts the imitation error as an adversarial disturbance in the system dynamics, injects it during data collection to expose the expert to safety-critical states, and collects the resulting corrective actions. Our method biases training to replicate expert behavior more closely in safety-critical states while allowing more variance in less critical states. We compare our method with several behavior cloning techniques and DAgger on autonomous navigation and autonomous taxiing tasks and show higher task success and safety, especially in low-data regimes where the likelihood of error is higher, at the cost of a slight drop in performance.
- North America > United States > California (0.14)
- Europe > Italy > Sardinia (0.04)
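A minimal sketch of the data-collection idea described above: during each expert rollout, a bounded disturbance perturbs the dynamics so the expert is pushed toward riskier states, and the expert's corrective actions are logged for ordinary off-policy behavior cloning. `expert_action`, `dynamics`, and the toy adversary below are placeholders; the paper derives its disturbance adversarially from the system dynamics rather than from this stand-in.

```python
# Sketch of guided data collection: inject a disturbance into the rollout,
# record the expert's corrective state-action pairs. All pieces are
# task-specific placeholders.
import numpy as np

def toy_disturbance(state, bound):
    # Placeholder adversary: push the state away from the origin. The real
    # method chooses the disturbance to maximize safety risk.
    return bound * np.sign(state + 1e-8)

def collect_demonstrations(expert_action, dynamics, x0, steps, bound):
    data = []
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        u = expert_action(x)              # expert's corrective action at x
        data.append((x.copy(), u))        # log for off-policy cloning
        x = dynamics(x, u) + toy_disturbance(x, bound)  # perturbed step
    return data
```

Because only the data collection changes, the downstream learner stays plain behavior cloning, avoiding the repeated policy rollouts that on-policy aggregation methods require.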
DIGIC: Domain Generalizable Imitation Learning by Causal Discovery
Chen, Yang, Liang, Yitao, Lin, Zhouchen
Causality has been combined with machine learning to produce robust representations for domain generalization. Most existing methods of this type require massive data from multiple domains to identify causal features through cross-domain variations, which can be expensive or even infeasible and may lead to misidentification in some cases. In this work, we take a different approach by leveraging the demonstration data distribution to discover the causal features for a domain-generalizable policy. We design a novel framework, called DIGIC, that identifies the causal features by finding the direct causes of the expert action in the demonstration data distribution via causal discovery. Our framework can achieve domain-generalizable imitation learning with only single-domain data and can serve as a complement to cross-domain variation-based methods under non-structural assumptions on the underlying causal models. Our empirical study in various control tasks shows that the proposed framework improves domain generalization performance while achieving performance comparable to the expert in the original domain.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
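To make "finding the direct cause of the expert action" concrete, the sketch below scores each observation feature by how much action-prediction quality drops when that feature is shuffled, keeping only features that carry unique signal. This permutation test is a crude, hypothetical stand-in for the conditional-independence-based causal discovery the abstract refers to; the function name and threshold are assumptions.

```python
# Hypothetical stand-in for causal feature discovery from single-domain
# demonstrations: keep features whose shuffling hurts action prediction.
import numpy as np
from sklearn.linear_model import LinearRegression

def select_candidate_causes(obs, actions, threshold=0.01, seed=0):
    """obs: (N, D) observations; actions: (N, A) expert actions."""
    rng = np.random.default_rng(seed)
    base = LinearRegression().fit(obs, actions).score(obs, actions)
    keep = []
    for j in range(obs.shape[1]):
        shuffled = obs.copy()
        shuffled[:, j] = rng.permutation(shuffled[:, j])
        fit = LinearRegression().fit(shuffled, actions)
        if base - fit.score(shuffled, actions) > threshold:
            keep.append(j)               # feature carries unique signal
    return keep                          # then imitate using obs[:, keep]
```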